-
Free, publicly-accessible full text available March 26, 2026
-
We consider the problem of determining the manifold $$n$$-widths of Sobolev and Besov spaces with error measured in the $$L_p$$-norm. The manifold widths control how efficiently these spaces can be approximated by general non-linear parametric methods with the restriction that the parameter selection and parameterization maps must be continuous. Existing upper and lower bounds only match when the Sobolev or Besov smoothness index $$q$$ satisfies $$q\leq p$$ or $$1 \leq p \leq 2$$. We close this gap and obtain sharp lower bounds for all $$1 \leq p,q \leq \infty$$ for which a compact embedding holds. A key part of our analysis is to determine the exact value of the manifold widths of finite-dimensional $$\ell^M_q$$-balls in the $$\ell_p$$-norm when $$p\leq q$$. Although this result is not new, we provide a new proof and apply it to lower bounding the manifold widths of Sobolev and Besov spaces. Our results show that the Bernstein widths, which are typically used to lower bound the manifold widths, decay asymptotically faster than the manifold widths in many cases.

Free, publicly-accessible full text available December 1, 2025
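
For reference, the manifold widths discussed above are commonly defined as follows (a standard formulation in the sense of DeVore, Howard, and Micchelli; the paper's precise normalization may differ):

```latex
% Manifold n-width of a compact set K in a Banach space X:
% the best error achievable by an n-parameter method whose
% parameter-selection map a and parameterization map M are both continuous.
\[
  \delta_n(K)_X
  \;=\;
  \inf_{\substack{a:\, K \to \mathbb{R}^n \\ M:\, \mathbb{R}^n \to X \\ a,\, M \ \text{continuous}}}
  \;\sup_{f \in K}\;
  \bigl\| f - M(a(f)) \bigr\|_X .
\]
```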
-
Kolmogorov-Arnold Networks (KANs; Liu et al., 2024) were very recently proposed as a potential alternative to the prevalent architectural backbone of many deep learning models, the multi-layer perceptron (MLP). KANs have seen success in various tasks of AI for science, with their empirical efficiency and accuracy demonstrated in function regression, PDE solving, and many other scientific problems. In this article, we revisit the comparison of KANs and MLPs, with emphasis on a theoretical perspective. On the one hand, we compare the representation and approximation capabilities of KANs and MLPs. We establish that MLPs can be represented using KANs of a comparable size. This shows that the approximation and representation capabilities of KANs are at least as good as those of MLPs. Conversely, we show that KANs can be represented using MLPs, but that in this representation the number of parameters increases by a factor of the KAN grid size. This suggests that KANs with a large grid size may be more efficient than MLPs at approximating certain functions. On the other hand, from the perspective of learning and optimization, we study the spectral bias of KANs compared with MLPs. We demonstrate that KANs are less biased toward low frequencies than MLPs. We highlight that the multi-level learning feature specific to KANs, i.e., grid extension of splines, improves the learning process for high-frequency components. Detailed comparisons with different choices of depth, width, and grid sizes of KANs are made, shedding some light on how to choose the hyperparameters in practice.

Free, publicly-accessible full text available January 22, 2026
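
To make the grid-size factor in the parameter-count comparison concrete, here is a minimal, hypothetical KAN-style layer in NumPy. Each edge carries its own learnable 1-D function, expanded here in piecewise-linear hat functions on a fixed grid (actual KANs use B-splines plus a base activation; all names and sizes below are illustrative, not the paper's construction). A layer of this form has d_in * d_out * grid_size coefficients, versus roughly d_in * d_out for a dense MLP layer.

```python
import numpy as np

def hat_basis(x, grid):
    """Piecewise-linear (order-1 B-spline) hat functions on `grid`,
    evaluated at the 1-D array x; returns shape (len(x), len(grid))."""
    x = np.clip(x, grid[0], grid[-1])
    B = np.zeros((len(x), len(grid)))
    for k, t in enumerate(grid):
        if k > 0:                               # rising edge on [grid[k-1], t]
            left = grid[k - 1]
            m = (x >= left) & (x <= t)
            B[m, k] = (x[m] - left) / (t - left)
        if k + 1 < len(grid):                   # falling edge on [t, grid[k+1]]
            right = grid[k + 1]
            m = (x >= t) & (x <= right)
            B[m, k] = np.maximum(B[m, k], (right - x[m]) / (right - t))
    return B

class ToyKANLayer:
    """Each edge (i, j) carries its own learnable 1-D function
    phi_ij(x) = sum_k coef[i, k, j] * hat_k(x); the layer output is
    y_j = sum_i phi_ij(x_i). Parameter count: d_in * grid_size * d_out."""
    def __init__(self, d_in, d_out, grid_size=8, seed=0):
        rng = np.random.default_rng(seed)
        self.grid = np.linspace(-1.0, 1.0, grid_size)
        self.coef = 0.1 * rng.standard_normal((d_in, grid_size, d_out))

    def __call__(self, x):                      # x: (batch, d_in)
        out = np.zeros((x.shape[0], self.coef.shape[2]))
        for i in range(self.coef.shape[0]):
            B = hat_basis(x[:, i], self.grid)   # (batch, grid_size)
            out += B @ self.coef[i]             # accumulate sum_i phi_ij(x_i)
        return out

x = np.random.default_rng(1).standard_normal((4, 3))
print(ToyKANLayer(3, 2)(x).shape)               # (4, 2)
```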
-
We present a generalization of Nesterov's accelerated gradient descent algorithm. Our algorithm (AGNES) provably achieves acceleration for smooth convex and strongly convex minimization tasks with noisy gradient estimates if the noise intensity is proportional to the magnitude of the gradient at every point. Nesterov's method converges at an accelerated rate if the constant of proportionality is below 1, while AGNES accommodates any signal-to-noise ratio. The noise model is motivated by applications in overparametrized machine learning. AGNES requires only two parameters in convex and three in strongly convex minimization tasks, improving on existing methods. We further provide clear geometric interpretations and heuristics for the choice of parameters.

Free, publicly-accessible full text available December 15, 2025
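
The abstract does not spell out the update rule, so the following is only a generic sketch of Nesterov-style momentum under the multiplicative noise model described above (noise intensity proportional to the gradient magnitude); the actual AGNES update and its parameter choices are given in the paper, and all names below (lr, momentum, sigma) are illustrative.

```python
import numpy as np

def noisy_grad(grad_f, x, sigma, rng):
    """Gradient oracle with multiplicative noise: the noise magnitude
    scales with ||grad f(x)||, matching the abstract's noise model."""
    g = grad_f(x)
    noise = rng.standard_normal(x.shape) / np.sqrt(x.size)
    return g + sigma * np.linalg.norm(g) * noise

def nesterov_with_multiplicative_noise(grad_f, x0, lr=0.005, momentum=0.9,
                                       sigma=0.2, steps=500, seed=0):
    rng = np.random.default_rng(seed)
    x_prev, x = x0.copy(), x0.copy()
    for _ in range(steps):
        y = x + momentum * (x - x_prev)          # look-ahead point
        x_prev, x = x, y - lr * noisy_grad(grad_f, y, sigma, rng)
    return x

# Smoke test on a quadratic f(x) = 0.5 * ||A x||^2 (minimizer x = 0):
A = np.diag([1.0, 10.0])
x_final = nesterov_with_multiplicative_noise(lambda z: A.T @ (A @ z),
                                             np.array([1.0, 1.0]))
print(np.linalg.norm(x_final))  # small: iterates approach the minimizer
```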
-
Free, publicly-accessible full text available January 1, 2026
-
Canonicalization provides an architecture-agnostic method for enforcing equivariance, with generalizations such as frame-averaging recently gaining prominence as a lightweight and flexible alternative to equivariant architectures. Recent works have found an empirical benefit to using probabilistic frames instead, which learn weighted distributions over group elements. In this work, we provide strong theoretical justification for this phenomenon: for commonly used groups, there is no efficiently computable choice of frame that preserves continuity of the function being averaged. In other words, unweighted frame-averaging can turn a smooth, non-symmetric function into a discontinuous, symmetric function. To address this fundamental robustness problem, we formally define and construct weighted frames, which provably preserve continuity, and demonstrate their utility by constructing efficient and continuous weighted frames for the actions of SO(d), O(d), and S_n on point clouds.
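
As a toy illustration of the weighted-averaging mechanism (not the paper's construction, which targets SO(d), O(d), and S_n on point clouds), consider the reflection group {+1, -1} acting on R^d by x -> g*x: continuous, x-dependent weights that swap under the group action make the weighted average invariant while preserving continuity. The orientation score, sigmoid weights, and temperature tau below are illustrative choices.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def weighted_reflection_average(f, x, tau=0.1):
    """Return sum_g w_g(x) * f(g*x) over g in {+1, -1}.
    The weights satisfy w_{+1}(-x) = w_{-1}(x), so the average is
    invariant under x -> -x, and it is continuous in x."""
    s = np.sum(x)                       # a 1-D "orientation score" for x
    w_plus, w_minus = sigmoid(s / tau), sigmoid(-s / tau)  # sum to 1
    return w_plus * f(x) + w_minus * f(-x)

f = lambda x: x[0] + x[0] ** 2          # smooth but not reflection-invariant
x = np.array([0.3, -0.1])
print(np.isclose(weighted_reflection_average(f, x),
                 weighted_reflection_average(f, -x)))  # True: invariant
```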
-
We present convergence estimates of two types of greedy algorithms in terms of the entropy numbers of underlying compact sets. In the first part, we measure the error of a standard greedy reduced basis method for parametric PDEs by the entropy numbers of the solution manifold in Banach spaces. This contrasts with the classical analysis based on the Kolmogorov $$n$$-widths and enables us to obtain direct comparisons between the algorithm error and the entropy numbers, where the multiplicative constants are explicit and simple. The entropy-based convergence estimate is sharp and improves upon the classical width-based analysis of reduced basis methods for elliptic model problems. In the second part, we derive a novel and simple convergence analysis of the classical orthogonal greedy algorithm for nonlinear dictionary approximation using the entropy numbers of the symmetric convex hull of the dictionary. This also improves upon existing results by giving a direct comparison between the algorithm error and the entropy numbers.
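
For the second part, a minimal sketch of the classical orthogonal greedy algorithm (also known as orthogonal matching pursuit) over a finite dictionary may help fix ideas; the dictionary size, normalization, and variable names below are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def orthogonal_greedy(target, dictionary, n_steps):
    """target: (m,); dictionary: (m, N) with unit-norm columns.
    At each step, select the dictionary element most correlated with the
    current residual, then orthogonally project the target onto the span
    of all selected elements."""
    selected = []
    residual = target.copy()
    for _ in range(n_steps):
        # Greedy selection: element with the largest |<residual, d_k>|.
        k = int(np.argmax(np.abs(dictionary.T @ residual)))
        selected.append(k)
        # Orthogonal projection via least squares on the selected columns.
        D = dictionary[:, selected]
        coef, *_ = np.linalg.lstsq(D, target, rcond=None)
        residual = target - D @ coef
    return selected, residual

rng = np.random.default_rng(0)
D = rng.standard_normal((50, 200))
D /= np.linalg.norm(D, axis=0)              # normalize columns
y = D[:, :5] @ rng.standard_normal(5)       # target sparse in the dictionary
idx, r = orthogonal_greedy(y, D, n_steps=10)
print(sorted(idx), np.linalg.norm(r))       # residual shrinks with steps
```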